Search for: All records

Creators/Authors contains: "Marculescu, Diana"


Total Resources: 23

Resource Type
Conference Paper: 17
Conference Proceeding: 0
Dataset: 0
Journal Article: 6
Workshop Report: 0

Availability
Full Text / Resource Available: 20
Citation Only: 3


MobileTL: On-Device Transfer Learning with Inverted Residual Blocks

https://doi.org/10.1609/aaai.v37i6.25874

Chiang, Hung-Yueh; Frumkin, Natalia; Liang, Feng; Marculescu, Diana (June 2023, Proceedings of the AAAI Conference on Artificial Intelligence)

Transfer learning on the edge is challenging due to limited on-device resources. Existing work addresses this issue by training a subset of parameters or adding model patches. Developed with inference in mind, Inverted Residual Blocks (IRBs) split a convolutional layer into depthwise and pointwise convolutions, leading to more stacked layers, e.g., convolution, normalization, and activation layers. Though they are efficient for inference, IRBs require that additional activation maps be stored in memory in order to train the weights of convolution layers and the scales of normalization layers. As a result, their high memory cost prohibits training IRBs on resource-limited edge devices, making them unsuitable in the context of transfer learning. To address this issue, we present MobileTL, a memory- and computation-efficient on-device transfer learning method for models built with IRBs. MobileTL trains the shifts for internal normalization layers to avoid storing activation maps for the backward pass. Also, MobileTL approximates the backward computation of the activation layer (e.g., Hard-Swish and ReLU6) as a signed function, which enables storing a binary mask instead of activation maps for the backward pass. MobileTL fine-tunes a few top blocks (close to the output) rather than propagating the gradient through the whole network, which reduces the computation cost. Our method reduces memory usage by 46% and 53% for MobileNetV2 and V3 IRBs, respectively. For MobileNetV3, we observe a 36% reduction in floating-point operations (FLOPs) when fine-tuning 5 blocks, while incurring only a 0.6% accuracy reduction on CIFAR10. Extensive experiments on multiple datasets demonstrate that our method is Pareto-optimal (best accuracy under given hardware constraints) compared to prior work in transfer learning for edge devices.
Free, publicly-accessible full text available June 27, 2024
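
The binary-mask idea described in the MobileTL abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, and the choice of `x > 0` as the step threshold for Hard-Swish is an assumption made only for illustration. The sketch shows why a backward pass that treats the activation gradient as a step function needs one bit per element rather than the full-precision input.

```python
from typing import List, Tuple


def hard_swish_forward(x: List[float]) -> Tuple[List[float], List[bool]]:
    """Compute Hard-Swish, x * relu6(x + 3) / 6, and a one-bit-per-element mask.

    Only the mask is kept for the backward pass; the input x is not stored.
    Assumption for illustration: the gradient is approximated as 1 where x > 0.
    """
    y = [v * min(max(v + 3.0, 0.0), 6.0) / 6.0 for v in x]
    mask = [v > 0.0 for v in x]
    return y, mask


def masked_backward(grad_out: List[float], mask: List[bool]) -> List[float]:
    """Approximate backward pass: let the gradient through where the mask is set."""
    return [g if m else 0.0 for g, m in zip(grad_out, mask)]


if __name__ == "__main__":
    outputs, saved_mask = hard_swish_forward([-4.0, -1.0, 0.0, 1.0, 4.0])
    grads = masked_backward([0.5] * 5, saved_mask)
    print(outputs, saved_mask, grads)
```

Under these assumptions, the saved state per element shrinks from a floating-point value to a single boolean, which is the kind of memory saving the abstract attributes to storing a binary mask.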
CLIP4VideoCap: Rethinking Clip for Video Captioning with Multiscale Temporal Fusion and Commonsense Knowledge

https://doi.org/10.1109/ICASSP49357.2023.10097128

Mahmud, Tanvir; Liang, Feng; Qing, Yaling; Marculescu, Diana (June 2023, 2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP))

Free, publicly-accessible full text available June 4, 2024
Open-Vocabulary Semantic Segmentation with Mask-adapted CLIP

https://doi.org/10.1109/CVPR52729.2023.00682

Liang, Feng; Wu, Bichen; Dai, Xiaoliang; Li, Kunpeng; Zhao, Yinan; Zhang, Hang; Zhang, Peizhao; Vajda, Peter; Marculescu, Diana (June 2023, 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))

Free, publicly-accessible full text available June 1, 2024
DeepNVM++: Cross-Layer Modeling and Optimization Framework of Nonvolatile Memories for Deep Learning

https://doi.org/10.1109/TCAD.2021.3127148

Inci, Ahmet; Isgenc, Mehmet Meric; Marculescu, Diana (October 2022, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems)

Full Text Available
ANT: Adapt Network Across Time for Efficient Video Processing

https://doi.org/10.1109/CVPRW56347.2022.00293

Liang, Feng; Chin, Ting-Wu; Zhou, Yang; Marculescu, Diana (June 2022, 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW))

Full Text Available
QUIDAM: A Framework for Quantization-Aware DNN Accelerator and Model Co-Exploration

https://doi.org/10.1145/3555807

Inci, Ahmet; Virupaksha, Siri Garudanagiri; Jain, Aman; Chin, Ting-Wu; Thallam, Venkata Vivek; Ding, Ruizhou; Marculescu, Diana (September 2022, ACM Transactions on Embedded Computing Systems)

As the machine learning and systems communities strive to achieve higher energy efficiency through custom deep neural network (DNN) accelerators, varied precision or quantization levels, and model compression techniques, there is a need for design space exploration frameworks that incorporate quantization-aware processing elements into the accelerator design space while having accurate and fast power, performance, and area models. In this work, we present QUIDAM, a highly parameterized quantization-aware DNN accelerator and model co-exploration framework. Our framework can facilitate future research on design space exploration of DNN accelerators for various design choices such as bit precision, processing element type, scratchpad sizes of processing elements, global buffer size, number of total processing elements, and DNN configurations. Our results show that different bit precisions and processing element types lead to significant differences in terms of performance per area and energy. Specifically, our framework identifies a wide range of design points where performance per area and energy vary by more than 5× and 35×, respectively. With the proposed framework, we show that lightweight processing elements achieve on-par accuracy results and up to 5.7× more performance per area and energy improvement when compared to the best INT16-based implementation. Finally, due to the efficiency of the pre-characterized power, performance, and area models, QUIDAM can speed up the design exploration process by 3-4 orders of magnitude, as it removes the need for expensive synthesis and characterization of each design.
Full Text Available
Renofeation: A Simple Transfer Learning Method for Improved Adversarial Robustness

https://doi.org/10.1109/CVPRW53098.2021.00362

Chin, Ting-Wu; Zhang, Cha; Marculescu, Diana (June 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW))
Full Text Available
Width transfer: on the (in)variance of width optimization

https://doi.org/10.1109/CVPRW53098.2021.00334

Chin, Ting-Wu; Marculescu, Diana; Morcos, Ari S. (June 2021, 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW))
Full Text Available
Edge AI: Systems Design and ML for IoT Data Analytics

https://doi.org/10.1145/3394486.3406479

Marculescu, Radu; Marculescu, Diana; Ogras, Umit (July 2020, 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining)
With the explosion in Big Data, it is often forgotten that much of the data nowadays is generated at the edge. Specifically, a major source of data is users' endpoint devices, such as phones and smart watches, that are connected to the internet, also known as the Internet-of-Things (IoT). This "edge of data" faces several new challenges related to hardware constraints, privacy-aware learning, and distributed learning (both training and inference). So what systems and machine learning algorithms can we use to generate or exploit data at the edge? Can network science help us solve machine learning (ML) problems? Can IoT devices help people who live with some form of disability, and many others, benefit from health monitoring? In this tutorial, we introduce the network science and ML techniques relevant to edge computing, discuss systems for ML (e.g., model compression, quantization, and HW/SW co-design) and ML for systems design (e.g., run-time resource optimization and power management for training and inference on edge devices), and illustrate their impact in addressing concrete IoT applications.
Full Text Available
One Weight Bitwidth to Rule Them All

Chin, Ting-Wu; Chuang, Pierce; Chandra, Vikas; Marculescu, Diana (August 2020, European Conference on Computer Vision Workshops)

Weight quantization for deep ConvNets has shown promising results for applications such as image classification and semantic segmentation and is especially important for applications where memory storage is limited. However, when aiming for quantization without accuracy degradation, different tasks may end up with different bitwidths. This creates complexity for software and hardware support and the complexity accumulates when one considers mixed-precision quantization, in which case each layer’s weights use a different bitwidth. Our key insight is that optimizing for the least bitwidth subject to no accuracy degradation is not necessarily an optimal strategy. This is because one cannot decide optimality between two bitwidths if one has smaller model size while the other has better accuracy. In this work, we take the first step to understand if some weight bitwidth is better than others by aligning all to the same model size using a width-multiplier. Under this setting, somewhat surprisingly, we show that using a single bitwidth for the whole network can achieve better accuracy compared to mixed-precision quantization targeting zero accuracy degradation when both have the same model size. In particular, our results suggest that when the number of channels becomes a target hyperparameter, a single weight bitwidth throughout the network shows superior results for model compression.
Full Text Available
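
The size-alignment step mentioned in the abstract above (comparing bitwidths at equal model size via a width multiplier) can be made concrete with a little arithmetic. The sketch below is illustrative only and rests on two assumptions that are not stated in the abstract: model size is taken as parameter count times weight bitwidth, and parameter count is taken to grow with the square of the width multiplier, which is only approximately true for standard convolutions and does not hold for depthwise layers. The function names are hypothetical.

```python
import math


def model_size_bits(base_params: float, width_mult: float, bitwidth: int) -> float:
    """Approximate model size in bits.

    Assumption: parameters scale quadratically with the width multiplier,
    because both input and output channels of a layer are scaled.
    """
    return base_params * width_mult ** 2 * bitwidth


def width_for_equal_size(ref_bitwidth: int, bitwidth: int, ref_width: float = 1.0) -> float:
    """Width multiplier that gives a `bitwidth`-bit model the same size as the reference."""
    return ref_width * math.sqrt(ref_bitwidth / bitwidth)


if __name__ == "__main__":
    # Compare several bitwidths against an 8-bit reference at width 1.0.
    for bits in (1, 2, 4, 8):
        w = width_for_equal_size(8, bits)
        print(bits, round(w, 3), model_size_bits(1_000_000, w, bits))
```

Under these assumptions, a lower bitwidth buys a wider network at the same storage budget, which is the trade-off the abstract examines when the number of channels becomes a target hyperparameter.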
